83 research outputs found

    Building and Tracking Hierarchical Geographical & Temporal Partitions for Image Collection Management on Mobile Devices

    Usage of mobile devices (phones, digital cameras) raises the need for organizing large personal image collections. In accordance with studies on user needs, we propose a statistical criterion and an associated optimization technique, relying on geo-temporal image metadata, for building and tracking a hierarchical structure on the image collection. In a mixture model framework, particularities of the application and typical data sets are taken into account in the design of the scheme (incrementality, ability to cope with non-Gaussian data, with both small and large samples). Results are reported on real data sets.

    Gossip-based computation of a Gaussian mixture model for distributed multimedia indexing

    The present paper deals with pattern recognition in a distributed, peer-to-peer computing context, which is of growing interest for multimedia data indexing and retrieval. Our goal is to estimate class-conditional probability densities that take the form of Gaussian mixture models (GMM). The originality of our approach is to propagate GMMs in a decentralized fashion (gossip) through a network and to aggregate GMMs from various sources, using a technique that involves little computation and makes parsimonious use of network resources, since model parameters rather than data are transmitted. The aggregation is based on iterative optimization of an approximation of the KL divergence that allows closed-form computation between mixture models. Experimental results demonstrate the scheme on the case of speaker recognition.
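    The divergence-based aggregation described above rests on a standard fact: the KL divergence between two Gaussian components has a closed form, whereas the KL divergence between two mixtures generally does not, which is why an approximation is needed. The following is a minimal sketch of that building block only, for the univariate case. The function name `kl_gaussian` is illustrative and not taken from the paper, and the paper's mixture-level approximation is not reproduced here.

```python
import math


def kl_gaussian(mean_p, std_p, mean_q, std_q):
    """Closed-form KL(p || q) for p = N(mean_p, std_p^2) and q = N(mean_q, std_q^2).

    Illustrative helper (hypothetical name); univariate case only.
    """
    if std_p <= 0 or std_q <= 0:
        raise ValueError("standard deviations must be positive")
    return (
        math.log(std_q / std_p)
        + (std_p ** 2 + (mean_p - mean_q) ** 2) / (2.0 * std_q ** 2)
        - 0.5
    )
```

    For two unit-variance components whose means differ by one, `kl_gaussian(0.0, 1.0, 1.0, 1.0)` evaluates to 0.5, and the divergence of a component from itself is 0. Note that the KL divergence is asymmetric, so the order of the arguments matters.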

    An algebraic approach to ensemble clustering

    Consensus clustering aims at providing a single partition that fits a consensus drawn from a set of independently generated clusterings. Common procedures, which are mainly statistical and graph-based, are recognized for their robustness and ability to scale up. In this paper, we provide a complementary and original viewpoint on consensus clustering, by means of algebraic definitions that make it possible to ascertain the nature of available inferences within a systematic approach (e.g. a knowledge base). We base our approach on the lattice of partitions, and show how operators can be added to it in order to express a formula representing the consensus. We show that adopting an incremental approach can help retain a significant amount of aggregate data that fits well with the set of input clusterings. Beyond this ability to model formulae, we also note that the potential of the approach cannot be easily captured through such a logical system. This is due to the volatile nature of handling partitions, which ultimately affects the ability to draw valuable conclusions.

    A low-cost variational-Bayes technique for merging mixtures of probabilistic principal component analyzers

    Mixtures of probabilistic principal component analyzers (MPPCA) have proven effective for modeling high-dimensional data sets living on nonlinear manifolds. Briefly stated, they conduct mixture model estimation and dimensionality reduction through a single process. This paper makes two contributions. First, we present a Bayesian technique for estimating such mixture models. Second, assuming several MPPCA models are available, we address the problem of aggregating them into a single MPPCA model, which should be as parsimonious as possible. We describe in detail how this can be achieved in a cost-effective way, without sampling or access to data, requiring only the mixture parameters. The proposed approach is based on a novel variational-Bayes scheme operating over model parameters. Numerous experimental results and a discussion are provided.

    A decentralized and robust approach to estimating a probabilistic mixture model for structuring distributed data

    Data sharing services on the web host huge amounts of resources supplied and accessed by millions of users around the world. While the classical approach is central control over the data set, even if this data set is distributed, there is growing interest in decentralized solutions because of their good properties (in particular, privacy and scalability). In this paper, we explore a machine learning side of this line of work. We propose a novel technique for decentralized estimation of probabilistic mixture models, which are among the most versatile generative models for understanding data sets. More precisely, we demonstrate how to estimate a global mixture model from a set of local models. Our approach accommodates dynamic topology and data sources and is statistically robust, i.e. resilient to the presence of unreliable local models. Such outlier models may arise from local data that are outliers compared to the global trend, or from poor mixture estimation. We report experiments on synthetic data and on real geo-location data from Flickr.

    Fast aggregation of Student mixture models

    This paper deals with probabilistic models that take the form of mixtures of Student distributions. Student distributions are known to be more statistically robust than Gaussian distributions with regard to outliers (i.e. data that cannot be reasonably explained by any component in the mixture and that do not justify an extra component). Our contribution is as follows: we show how several mixtures of Student distributions may be aggregated into a single mixture, without resorting to sampling. The trick is that, as is well known, a Student distribution may be expressed as an infinite mixture of Gaussians, where the variances follow a Gamma distribution.
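    The infinite-mixture representation mentioned above can be checked numerically. The sketch below is not the paper's aggregation method; it only illustrates the well-known identity for a standard univariate Student distribution with `nu` degrees of freedom. In the usual parametrisation, the Gamma-distributed latent variable `u` (shape `nu/2`, rate `nu/2`) scales the precision, i.e. the inverse variance, of a zero-mean Gaussian. The function names and the integration settings (`upper`, `steps`) are illustrative assumptions.

```python
import math


def student_pdf(x, nu):
    """Density of the standard univariate Student distribution with nu degrees of freedom."""
    coeff = math.gamma((nu + 1.0) / 2.0) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2.0))
    return coeff * (1.0 + x * x / nu) ** (-(nu + 1.0) / 2.0)


def student_pdf_via_mixture(x, nu, upper=60.0, steps=100_000):
    """Same density, obtained by integrating Gaussians over a Gamma-distributed precision scale.

    Computes the integral over u of N(x | 0, 1/u) * Gamma(u | shape=nu/2, rate=nu/2)
    with a plain midpoint rule on (0, upper). Assumes nu >= 1 so that the
    integrand stays bounded near u = 0.
    """
    shape = rate = nu / 2.0
    gamma_norm = rate ** shape / math.gamma(shape)
    h = upper / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h  # midpoint of the i-th sub-interval
        gamma_density = gamma_norm * u ** (shape - 1.0) * math.exp(-rate * u)
        gaussian_density = math.sqrt(u / (2.0 * math.pi)) * math.exp(-0.5 * u * x * x)
        total += gamma_density * gaussian_density
    return total * h
```

    Under these assumptions the two functions agree to within numerical integration error, for instance at `x = 0.0`, `1.0`, and `2.5` with `nu = 3.0`. Because each Student component is a continuous mixture of Gaussians, operations that are tractable for Gaussian components can in principle be carried over, which is the property the abstract relies on.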

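    Organisation statistique spatio-temporelle d'une collection d'images acquises d'un terminal mobile géolocalisé [Spatio-temporal statistical organization of a collection of images acquired from a geolocated mobile device]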

    We present an automatic technique for organizing personal image collections, addressing the emerging needs specific to mobile phones equipped with a camera. After examining what makes this context distinctive, we propose a technique for structuring an image collection based on the date and location at which the images were taken. The goal is formalized as an unsupervised classification problem that is both temporal and spatial. The integrated completed likelihood (ICL) statistical criterion is adopted because it provides an effective solution for determining the complexity of the model and a good level of separability between its components, while limiting the arbitrariness of the parameterization. The reliability of the resulting classifications is then evaluated in order to select the most relevant one, providing a structure that can be used with an electronic-calendar-style interface for exploring the collection.

    Aggregation of probabilistic PCA mixtures with a variational-Bayes technique over parameters

    This paper proposes a solution to the problem of aggregating versatile probabilistic models, namely mixtures of probabilistic principal component analyzers. These models are a powerful generative form for capturing high-dimensional, non-Gaussian data. They simultaneously perform mixture adjustment and dimensionality reduction. We demonstrate how such models may be advantageously aggregated by accessing mixture parameters only, rather than original data. Aggregation is carried out through Bayesian estimation with a specific prior and an original variational scheme. Experimental results illustrate the effectiveness of the proposal.
